Less is more: optimal learning with subsampling regularization
Abstract
In this talk, we discuss recent results on common techniques for scaling up nonparametric methods such as kernel methods and Gaussian processes. In particular, we focus on data-dependent and data-independent subsampling methods, namely Nyström and random features, and study their generalization properties within a statistical learning theory framework. On the one hand, we show that these methods can achieve optimal learning errors while being computationally efficient. On the other hand, we show that subsampling can be seen as a form of regularization, rather than only a way to speed up computations. [Joint work with Raffaello Camoriano and Alessandro Rudi.]
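To make the two subsampling schemes concrete, here is a minimal sketch in Python/NumPy of both estimators: plain Nyström kernel ridge regression (data-dependent subsampling) and ridge regression on random Fourier features (data-independent subsampling). The function names, the uniform choice of landmarks, and the Gaussian kernel are illustrative assumptions, not code from the talk; the point is that the subsampling level (m landmarks or D features) acts as a regularization parameter alongside the ridge parameter lam.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-sq / (2.0 * sigma**2))

def nystrom_krr_fit(X, y, m, lam, sigma=1.0, seed=0):
    """Plain Nystrom kernel ridge regression (data-dependent subsampling).

    Picks m landmarks uniformly at random and solves
        alpha = (Knm^T Knm + lam * n * Kmm)^+ Knm^T y,
    so the subsampling level m acts as a regularization knob on top of lam.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    Xm = X[rng.choice(n, size=m, replace=False)]
    Knm = gaussian_kernel(X, Xm, sigma)           # n x m
    Kmm = gaussian_kernel(Xm, Xm, sigma)          # m x m
    alpha, *_ = np.linalg.lstsq(Knm.T @ Knm + lam * n * Kmm,
                                Knm.T @ y, rcond=None)
    return Xm, alpha                              # predict: k(x, Xm) @ alpha

def rff_ridge_fit(X, y, D, lam, sigma=1.0, seed=0):
    """Random Fourier features (data-independent subsampling) plus ridge.

    z(x) = sqrt(2/D) * cos(W^T x + b) approximates the Gaussian kernel;
    the number of features D plays the same regularizing role as m above.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.normal(scale=1.0 / sigma, size=(d, D))
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)
    Z = np.sqrt(2.0 / D) * np.cos(X @ W + b)      # n x D feature map
    w, *_ = np.linalg.lstsq(Z.T @ Z + lam * n * np.eye(D),
                            Z.T @ y, rcond=None)
    return (W, b), w                              # predict: z(x) @ w

# Example: fit on n = 2000 toy points with only m = 100 landmarks.
# X = np.random.default_rng(0).normal(size=(2000, 3)); y = np.sin(X[:, 0])
# Xm, alpha = nystrom_krr_fit(X, y, m=100, lam=1e-3)
# y_pred = gaussian_kernel(X, Xm) @ alpha
```

With m ≪ n landmarks, fitting costs O(nm²) time and O(nm) memory rather than the O(n³) time and O(n²) memory of exact kernel ridge regression; the theory discussed in the talk says m can often be taken much smaller than n without losing statistical accuracy.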
Similar resources
Less is More: Nyström Computational Regularization
We study Nyström-type subsampling approaches to large-scale kernel methods, and prove learning bounds in the statistical learning setting, where random sampling and high-probability estimates are considered. In particular, we prove that these approaches can achieve optimal learning bounds, provided the subsampling level is suitably chosen. These results suggest a simple incremental variant of N...
Manifold regularization based on Nyström type subsampling
In this paper, we study Nyström-type subsampling for large-scale kernel methods to reduce the computational complexity of big data problems. We discuss a multi-penalty regularization scheme based on Nyström-type subsampling, motivated by well-studied manifold regularization schemes. We develop a theoretical analysis of the multi-penalty least-squares regularization scheme under the general ...
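As a point of reference, the well-studied manifold regularization scheme the abstract alludes to combines a data-fit term with two penalties; in standard notation (an assumption, since the paper's exact functional is truncated here), the multi-penalty estimator reads

\hat f = \arg\min_{f \in \mathcal{H}} \frac{1}{n}\sum_{i=1}^{n} \big(f(x_i) - y_i\big)^2 + \lambda_A \|f\|_{\mathcal{H}}^2 + \lambda_I\, \mathbf{f}^\top L\, \mathbf{f},

where \mathbf{f} = (f(x_1), \dots, f(x_n)), L is a graph Laplacian built from the data, \lambda_A is the ambient (ridge) penalty, and \lambda_I the intrinsic (manifold) penalty; the Nyström step then restricts the minimization to the span of m subsampled points.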
Regularization in Statistics
This paper is a selective review of the regularization methods scattered in the statistics literature. We introduce a general conceptual approach to regularization and fit most existing methods into it. We have tried to focus on the importance of regularization when dealing with today's high-dimensional objects: data and models. A wide range of examples is discussed, including nonparametric regres...
NYTRO: When Subsampling Meets Early Stopping
Early stopping is a well-known approach to reducing the time complexity of training and model selection for large-scale learning machines. On the other hand, memory/space (rather than time) complexity is the main constraint in many applications, and randomized subsampling techniques have been proposed to tackle this issue. In this paper we ask whether early stopping and subsampling ide...
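A minimal sketch of the combination this paper studies, assuming a NYTRO-like setup (the kernel helper, step-size choice, and validation-based stopping rule below are illustrative, not the authors' code): gradient descent is run on the Nyström-subsampled least-squares problem, and the number of iterations, selected on a validation set, plays the role of the regularization parameter.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-sq / (2.0 * sigma**2))

def nystrom_gd_early_stop(X, y, Xval, yval, m, max_iters=500, sigma=1.0, seed=0):
    """Gradient descent on the Nystrom-subsampled least-squares objective,
    early-stopped on a validation set: memory stays bounded by the m
    landmarks, while the iteration count acts as the regularization knob."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    Xm = X[rng.choice(n, size=m, replace=False)]
    Knm = gaussian_kernel(X, Xm, sigma)           # n x m training features
    Kvm = gaussian_kernel(Xval, Xm, sigma)        # validation features
    step = n / np.linalg.norm(Knm.T @ Knm, 2)     # 1 / Lipschitz constant
    alpha = np.zeros(m)
    best = (np.inf, alpha.copy())
    for _ in range(max_iters):
        alpha -= step * Knm.T @ (Knm @ alpha - y) / n   # gradient step
        val_err = np.mean((Kvm @ alpha - yval) ** 2)
        if val_err < best[0]:                     # keep the best iterate
            best = (val_err, alpha.copy())
    return Xm, best[1]
```

Stopping at the validation minimum replaces the explicit ridge penalty of the previous sketches: early iterates are heavily regularized, later ones progressively less so.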
Big Learning with Bayesian Methods
The explosive growth in data volume and the availability of cheap computing resources have sparked increasing interest in Big learning, an emerging subfield that studies scalable machine learning algorithms, systems and applications with Big Data. Bayesian methods represent one important class of statistical methods for machine learning, with substantial recent developments on adaptive, flexibl...
Publication date: 2015